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We propose a time-space discretization scheme for quasi-linear 
parabolic PDEs. The algorithm relies on the theory of fully cou- 
pled forward-backward SDEs, which provides an efficient probabilis- 
tic representation of this type of equation. The derivated algorithm 
holds for strong solutions defined on any interval of arbitrary length. 
As a bypass product, we obtain a discretization procedure for the un- 
derlying FBSDE. In particular, our work provides an alternative to 
the method described in [Douglas, Ma and Protter (1996) Ann. Appl. 
Probab. 6 940-968] and weakens the regularity assumptions required 
in this reference. 

1. Introduction. Introduced first by Antonelli [1] and then by Ma, Prot- 
ter and Yong [14], forward-backward stochastic differential equations (FB- 
SDEs in short) provide an extension of the Feynman-Kac representation to 
a certain class of quasi-linear parabolic PDEs. These equations also appear 
in a large number of application fields such as the Hamiltonian formulation 
of control problems or the option hedging problem with large investors in 
financial mathematics (i.e., when the wealth or strategy of an agent has an 
impact on the volatility). We refer to the monograph of Ma and Yong [15] 
for details and further applications. 

1.1. FBSDE theory and discretization algorithm. 

Connection between FBSDEs and quasi-linear parabolic PDEs. Consider 
a probability space (0,.^, P) endowed with a d-dimensional Brownian mo- 
tion (-Bt)jg[Q , where T denotes an arbitrarily prescribed positive real. For 
a given initial condition xq^W^, a forward-backward SDE strongly couples 
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a diffusion process U to the solution {V, W) of a backward SDE (as defined 
in the earlier work of Pardoux and Peng [20]): 



yt£[0,T] Ut = xo+ f\{Us,Vs,Ws)ds+ f\{Us,Vs)dBs, 

Jo Jo 

Vt = H{UT)+ r fiUs,V,,W,)ds- rWsdBs. 
Jt Jt 



In this paper the coefficients b, f, a and H are deterministic (and, for 
simplicity, also time independent). In this case, Ma, Protter and Yong [14], 
Pardoux and Tang [21] and Delarue [6] have investigated in detail the link 
with the following quasi-linear PDE on [0,T[ x W^: 

dtu{t, x) + {b{x, u{t, x),Vxu{t, x)a{x, u{t, x))),S/xu{t, x)) 
+ ^tr{a{x, u{t, x))Vl u{t, x)) 

+ fix, u{t, x),Vxu{t, x)a{x, u{t, x))) = 0, 
u{T,x) = H{x), 
with a{x,y) = {aa*){x,y), {x,y) G M'^ x M. 

A probabilistic numerical method for FBSDEs and quasi-linear PDEs. 
This paper aims to derive from the probabilistic theory of FBSDEs a com- 
pletely tractable algorithm to approximate the solution of equation [E). As 
a bypass product, the procedure also provides a discretization of the triple 
{U,V,W). 

Most of the available numerical methods proposed so far are purely ana- 
lytic and involve finite-difference or finite-element techniques to approximate 
the solution u of {£). For example, the discretization procedure for FBSDEs 
of type {E), given in [10], consists in discretizing first the PDE {£) and then 
in deriving an approximation of the underlying FBSDE. 

At the opposite, we propose in this paper to derive from the FBSDE rep- 
resentation a numerical scheme for quasi-linear equations of type {£). This 
strategy finds its origin in the earlier work of Chevance [5], who introduced 
a time-space discretization scheme in the decoupled or so-called "pure back- 
ward" case. In this latter frame, the coefficients b and a do not depend on 
V and W and the forward equation reduces to a classical SDE. The process 
U then appears as an "objective diffusion." Note in this particular case that 
the time-space discretization scheme and the specific form of the system [E) 
permit to use a standard "dynamic programming principle." 

From a numerical point of view, two other kinds of approaches have been 
developed in the backward case. The first one is based on Monte Carlo 
simulations and Malliavin integration by parts; see [4]. The other one relies 
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on quantization techniques for a discretization scheme of the underlying 
forward equation. Quantization consists in approximating a random variable 
by a suitable discrete law. It provides a cheap and numerically efficient 
alternative to usual Monte Carlo methods to estimate expectations. In the 
works of Bally and Pages [2] or Bally, Pages and Printems [3] on American 
options, the key idea is to perform an optimal quantization procedure of a 
discretized version of the underlying diffusion process in order to compute 
once for all by a Monte Carlo method the corresponding semi-group. Then, 
the second step consists in doing a dynamic programming descent. For other 
applications of quantization, we refer to the works of Pages, Pham and 
Printems [18] or Pages and Printems [19]. 

Discretization strategy. In the coupled case, or quasi-linear framework, 
the diffusion U is not "objective" anymore. Indeed, due to the strong nonlin- 
earity of the equation (iS), the coefficients of the underlying forward diffusion 
depend on the solution and on its gradient. 

In particular, we cannot quantify a discretization scheme of the diffusion 
process as explained above. This is well understood: without approximat- 
ing u, we do not have any a priori knowledge of the optimal shape of the 
associated grid. Hence, we just focus on the quantization of the Brownian 
increments appearing in the forward SDE and then choose to define the 
approximate diffusion on a sequence of truncated d-dimensional Cartesian 
grids. Note that the discretization procedure of U is now coupled to the ap- 
proximation procedure of {u, V xu) [denoted in a generic way by (n, v)] which 
is computed along the same sequence of grids. The time-space discretization 
scheme allows to define [u^v) and the approximations of the transitions of 
U in order to recover a kind of "dynamic programming principle." Consider 
indeed a given regular time mesh (tj = «MiG{o,...,Af} of [0)^]i ^ being the 
step size. To every discretization time ti, associate a spatial Cartesian grid 
d = {(4)fcGXi}> C N*, such that Vz G {0, . . . , - 1}, Cj C Ci+i. Start- 
ing from tN = T for which the solution of {£) and its gradient are known, 
the transition of U from ti to ti+i, i G {0, . . . ,N — 1}, is then updated itera- 
tively through the Brownian quantized increments and through the values of 
u(ti+i, •) and v{ti+i, •) on the grid Cj+i. This permits to express the approx- 
imation u{ti,-) through a discretized version of the Feynman-Kac formula. 

At this stage, it remains to specify the way we update the approximation 
of the gradient of the solution u. We mention actually that the strategy 
aims to approximate the product 'SJxu{tk, ■)a{-,u[tj^, •)) instead of V xu{tk, •) 
itself. This explains the specific writing of the PDE {£). We then proceed 
in two different steps. A first approximation is performed through a mar- 
tingale increment procedure as done in the discretization scheme of BSDEs 
explained in [4], or as used in [3]. A second step consists in quantizing the 
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Gaussian increments appearing in the former representation. This is an al- 
ternative solution to the usual techniques based on Monte Carlo simulations 
or on Malliavin integration by parts as employed in [4]. Of course, if the 
matrix aa* is nondegenerate, the strategy still applies, up to an inversion 
procedure, to coefficients of the form {b, f){x,u{t,x),Vxu{t,x)). 

Extra references. Some of the preliminaries of our approach can be found 
in [17] in the specific case where {b,f){x, u{t, x) ,V xu{t, x)a{t, x,u{t, x))) re- 
duces to {b, f){x,u(t,x)). Note, however, that the proof of the convergence 
of the underlying numerical scheme proposed in this reference just holds 
for so-called "equations with small parameter" (i.e., with a small diffusion 
matrix). Generally speaking, the authors have then to control the regular- 
ity properties of the solution of the transport problem associated to the 
equation {£) [i.e., the same equation as {£), but without any second-order 
terms]. Without discussing in detail the basic assumptions made in our pa- 
per, note that no condition of this type appears in the sequel: in particular, 
the matrix a is assumed to be uniformly elliptic. Hence, we feel that the 
work of Milstein and Tretyakov [17] applies to a different framework than 
ours. For this reason, we avoid any further comparisons between both sit- 
uations. Add finally, for the sake of completeness, that Makarov [16] has 
successfully applied the strategy of Milstein and Tretyakov [17] to the case 
{b, f) = (b, f){x,u{t,x),'Vxu{t,x)a{t,x,u{t,x))) under suitable smoothness 
properties on the coefficients. Of course, the small parameter condition is 
then still necessary. 

1.2. Novelties brought by the paper. 

A purely probabilistic point of view. The proof of the convergence of 
our algorithm is somehow the first to be essentially of probabilistic nature, 
since we are able to adapt the usual stability techniques of BSDE theory 
to the discretized framework. Note, in particular, that we follow the proof 
of uniqueness in the four step scheme given in [14] to handle the strong 
coupling between the forward and backward components. 

In the discretized framework, the gradient terms appearing in b and / 
bring additional difficulties. Indeed, our gradient approximation does not 
appear as a representation process given by the martingale representation 
theorem as the process W in (E). In particular, the strategy introduced by 
Pardoux and Peng [20] to estimate the norm of W over [0,T] fails in 
the discretized setting. We then propose a specific probabilistic strategy to 
overcome this deep trouble and thus to handle the nonlinearities of order 
one, see Sections 3.3 and 9.3 for details. 
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Convergence under weak assumptions. In [10], the authors handle the 
gradient terms by working under smoothness assumptions that allow them 
to study the gradient of u as the solution of the differentiated PDE. 

Our strategy permits to avoid to differentiate the PDE and thus to really 
weaken the assumptions required both on the coefficients of (E) and on the 
smoothness of the solution u of {£) in the above reference. In the previous pa- 
per, the coefficients are assumed to be smoothly differentiable and bounded. 
We just suppose that they are Lipschitz continuous and bounded in x. In [10] , 
the solution u of {£) is at least bounded in C2+°/2'4+«([o, T] x M"^), a G ]0, 1[. 
In our paper we only impose u to belong to C^'2([0,T] x R'^) with bounded 
derivatives of order one in t and one and two in x. 

A completely tractable algorithm. Furthermore, in [10], the authors al- 
ways take into consideration the case of infinite spatial grids. This turns out 
to be simpler for the convergence analysis, anyhow it does not provide in 
all generality a fully implementable algorithm. We discuss the impact of the 
truncation of the grids and analyze its contribution in the error. 

Finally, a linear interpolation procedure is also used in [10] to define the 
algorithm. This can be heavy in large dimension. The algorithm we propose 
allows to define the approximate solution only at the nodes of the spatial 
grid. In this way, we feel that our method is simpler to implement and 
numerically cheaper. Note, moreover, that we avoid the inversion of large 
linear systems associated to "usual" numerical analysis techniques. 

1.3. Organization of the paper. In Section 2 we detail general assump- 
tion and notation, as well as several smoothness properties of the solution 
u of {£). We also specify the connection between the FBSDE (E) and the 
quasi- linear PDE {£). Section 3 explains the main algorithmic choices. We 
present, in particular, the various steps that led us to the current discretiza- 
tion scheme. The main results are stated and discussed in Section 4. In 
particular, we give an estimate of the speed of convergence of the algorithm. 
As a probabilistic counterpart, we estimate the difference between the ap- 
proximating processes and the initial solution (U,V,W) of (E). Numerical 
examples are presented in Section 5. 

The end of the paper is then mainly devoted to the proof of the conver- 
gence results. The proof is divided into three parts. Various a priori controls 
of the discrete objects are stated and proved in Section 6. In Section 7 we 
adapt the FBSDE machinery to our setting to prove a suitable stability 
property. Section 8 is then devoted to the last step of the proof and, more 
precisely, to a specific refinement of Gronwall's lemma. In order to be con- 
cise, we sometimes only sketch the proofs. They are presented in detail in 
the electronic version Delarue and Menozzi [9]. 
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As a conclusion, we compare in Section 9 our strategy to other methods 
and explain some technical points that motivated the choice of our current 
algorithm. We also indicate further conceivable extensions. 

2. Nonlinear Feynman Kac formula. In this section we first give the 
assumptions on the coefficients of the FBSDE and then briefly recall the 
connection with quasi-linear PDEs. As detailed later, under these assump- 
tions, the underlying PDE admits a unique strong solution, whose partial 
derivatives of order one in t and one and two in x are controlled on the whole 
domain by known parameters. For the sake of simplicity, we also assume that 
the coefficients do not depend on time. 

2.1. Coefficients of the equation. For a given d G N*, we consider the 
coefficients 6 : M'' x M x M'^ ^ M-^*, / : M'^ x M x M'^ ^ M, a : M'' x M ^ M'^^'^, 

Assumption (A). We say that the functions 6, /, H and a satisfy As- 
sumption (A) if they are bounded in space and have at most linear growth 
in the other variables, are uniformly Lipschitz continuous w.r.t. all the vari- 
ables, a = aa* is uniformly elliptic and H is bounded in C^"*""(M'^). 

From now on, Assumption (A) is in force. 

2.2. Forward-backward SDE. Consider now a given T > and a prob- 
ability space (O,.?-", P) endowed with a Brownian motion {Bt)o<t<T whose 
natural filtration, augmented with P null sets, is denoted by {J^t}o<t<T- 

Fix an initial condition xq G and recall (see [14] and [6]) that there 
exists a unique progressively measurable triple (U,V,W), with values in 

X M X M*^, such that Esup4g[o,T](|t^fP + l^fP) < +oo, E Jq iVFfpdt < +oo, 
and which satisfies P almost surely the couple of equations (E). 

2.3. Quasi-linear PDE. Thanks to [13], Chapter VI, Theorem 4.1, and 
to [14] (up to a regularization procedure of the coefficients), we claim that 
{£) admits a solution u £ C^''^{[0,T] x ]R'^,]R) satisfying the following: 

Theorem 2.1. There exists a constant C2 f, only depending on T and 
on known parameters deriving from Assumption (A), such that V(t, x) S 

[o,r] X w^, 

\u{t,x)\ + \Vxu{t,x)\ + |V^^^n(t,x)| + \dtu{t,x)\ 

+ sup [\t-t'\~'^^^\V^u{t,x)-V:,u{t',x)W<C2 1- 

Moreover, u is unique in the class of functions u £ C{[0,T] xW^,R)r\C^'^{[0, 
T[xE'^,M) which satisfy sup(^^^,j.-j^^Q^j^[s^^d{\u{t,x)\ -\-\Vxu{t,x)\) < +00. 
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From [6, 14, 21], the FBSDE {E) is connected with the PDE {£). Set 
V {t, x) G [0, r[ X M'^, v{t, x) = V xu{t, x)a{x, u(t, x)). The relationship between 
(E) and {£) can be summed up as follows: Vt S [0,T], 

Vt = u{t,Ut), Wt = vit,Ut), 



2.1 



Vt = E[VT\J't]+K 



fiUs,Vs,W,)ds 



3. Approximation procedure. In this section we detail the construction 
of the approximation algorithm of the solution u of {£). We explain how the 
final form of the discretization procedure can be derived step by step from 
the forward-backward representation {E). We also present the quantization 
techniques used in order to compute expectations related to Brownian in- 
crements and we discuss the choice of the underlying spatial grids which 
appear in the approximating scheme. 

3.1. Rough algorithms. 

Localization procedure. Recall from the Introduction that the forward- 
backward equation {E) appears as the starting point of our discretization 
procedure. Indeed, this couple of stochastic equations provides a probabilis- 
tic representation of the quasi-linear PDE {£) and summarizes in an integral 
form the local evolution of the solution u. Define now, for a given integer 

> 1, a regular mesh of [0,r] with step h = T/N, that is, set t^ = kh, 
\/k G {0, ...,A'"}. Writing the local evolution of (E) and conditioning by 
Utf, =xeR'^, wededuce VA;e{0,...,iV-l}, 

(3.1) C/*^f^ =x + f^'^" b{Ul'^'-, ds + j^" aiUl"^^, dB, 



and 



t-k+l 



+ 



E 



-fc+i 



W^f*'^ ds 



where the superscript {tk,x) denotes the starting point of the diffusion pro- 
cess U. The remaining term 0{h^^^) is a consequence of Assumption (A), 
(2.1) (relationships between V,W and u) and Theorem 2.1 (boundedness of 
u and Vxu)- Relation (2.1) also yields 

ft 



u{tk,x) =E 



(3.2) 



/''k-\-l 



E 



Wj^'^ ds 



nu{tk+i,u!',zm,^, - Bt^)]+o{h^/'). 
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In the following the Brownian mcrement -B^^.^^ — Bt^, is denoted by AB^. 
In particular, we derive from the above relation that, neglecting the rest, 
the best constant approximation of (^i'"^)sG[tfe,tj._,_i] in the L^([tfc,tfc+i] x 
O, ds (8) dP) sense is given by 

(3.3) W^^'^ ^ h~'nu{tk+i, f/t*^f,)Ai?^]. 

Relationships (3.1), (3.2) and (3.3) provide a rough background to discretize 
the FBSDE (E). However, this first form is not satisfactory from an algo- 
rithmic point of view. Indeed, because of the strong coupling between the 
forward and the backward equations, the transition of the diffusion depends 
on the solution itself, both in the drift term and in the martingale part. 
At the opposite, in the so-called "pure backward" case, or, correspondingly, 
for semi-linear equations, the underlying operator does not depend on the 
solution. In such a case, the classical Euler machinery applies to discretize 
the decoupled diffusion U. 

Induction principle. Recall that similar difficulties occur to establish the 
unique solvability of the FBSDE (E). In [6] the first author overcomes the 
strong coupling between the forward and backward equations by solving by 
induction the local versions of {E) on [tk,tk+i], k running downward from 
A^ — 1 to 0. By analogy with this approach, the discretization procedure of the 
forward component on a step [tk,tk-\-i[, <k < N —1, must take into account 
the issues of the former local discretizations of the backward equation and, 
more specifically, the approximations of M(tfc+i,-) and t'(tfc+i,-). 

Predictors. Assume to this end that, at time tfc+i, some approximations 
u{tk+i,-),^i'tk+ir) of ii(tfc+ir)i ■y(*fc+ir) are available on the whole space. 
These approximations appear as the "natural" predictors of the true solu- 
tion and of its gradient on [tk,tk+i[. Introducing the forward approximating 
transition 

(3.4) T{tk,x) = b{x, u{tk+i,x),v{tk+i,x))h + a{x, u{tk+i,x))AB'', 
we derive an associated updating procedure by setting 

u{tk,x) = E[u{tk+i,x + T{tk, x))] + hf{x, u{tk+i,x),v(tk+i,x)), 

(3.5) 

vitk,x) = h~^E[u{tk+i,x + T {tk,x))AB^]. 

Once the predictors are updated, the procedure can be iterated. Of course, 
at time T = tTv, we set u{tN, •) = H{-) and v{t]y, •) = ^xH{-)a{-, H{-)). Note, 
in particular, that the expectations appearing in (3.5) are correctly defined. 
Indeed, a simple induction procedure shows from Assumption (A) that u 
and V are bounded on {to, . . . ,ti\f} x M'^ (but the bound depends on the 
discretization parameters) . 
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Spatial discretization. To obtain a numerical scheme, the most natural 
strategy consists in defining the approximations u{tk,-) and v{ti^,-) of the 
true solution and its gradient on a discrete subset of M'^. Those approxi- 
mations could then be extended to the whole space with a linear interpo- 
lation procedure. However, in high dimension, this last operation can be 
computationally demanding. We thus prefer, for simplicity, to restrict the 
approximations to a given spatial grid = {(xj^)jgij,,Xfc C N*} C W^, for 
k G {0, . . . , A^}. This choice imposes to modify (3.5). Indeed, the "terminal" 
value X + T{tk,x) must belong to the former grid Cfc+i- 

Hence, denoting by H^+i a projection mapping on the grid Ck+i, we re- 
place (3.5) by, Vx G Ck, 

u{tk, x) = E[u(ifc+i, nfc+i(x + T{tk, x)))] 

(3.6) + hf{x,u{tk+i,x),v{tk+i,x)), 

vitk,x) = h~^E[u{tk+i,Ilk+i{x + T{tk,x)))AB'']. 

In the following, we suppose that V (i, j) G {0, . . . , N}'^, j < i ^ Cj C Ci, so 
that u(tfc_|_i, x), u(tfc+i, x) are well defined for x £ Ck- Note that, if the car- 
dinal of Ck is finite for every k, the above scheme is already implementable 
up to the computations of the underlying expectations. 

Global updating. The use of the predictors u{tk+i, ■),v{tk+i, •) is an alter- 
native to the standard fixed point procedure. This latter consists in giving 
first some global predictors u^(tk,-)T^^{'tkr)i k £ {0, . . . ,N}. These are used 
to compute the transitions of the approximating forward process. In this 
way, we obtain a decoupled forward-backward system, whose solution may 
be computed by a standard dynamic programming algorithm. A complete 
descent of this algorithm from k = N to k = produces ti^(tfc, •), u^(tfc, •), 
k G {0, . . . , A^}, from which we can iterate the previous procedure. In this 
frame, the underlying distance used to describe the convergence of the 
fixed point procedure involves all the discretization times and all the spatial 
points. This strategy appears as a "global updating" one. 

From a numerical point of view, this seems unrealistic. Indeed, one would 
need to solve a large number of linear problems. This would either require 
to use massive Monte Carlo simulations at each step of the algorithm or 
to apply, again at each step of the algorithm, a quantization procedure of 
the approximate diffusion process associated to the current linear problem. 
Furthermore, it seems intuitively clear that a local updating is far more 
efficient than a global one. 

3.2. Quantization. 

Expectations approximation. Two methods are conceivable to compute 
expectations appearing in (3.6). 
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The first one consists in applying the classical Monte Carlo procedure 
for every k S {0,...,A^ — 1} and for every x £ and, therefore, to re- 
peat this argument J2k=o l-^^l times. Such a strategy would lead to perform 
^k=o l-^*:! ^ ^MC elementary operations to compute underlying expectations 
up to the error term emc- This approach seems rather hopeless. 

A more efficient method consists in replacing the Gaussian variables ap- 
pearing in (3.6) by discrete ones with known weights. This procedure is 
known as "quantization." Consider to this end a probability measure on 
with finite support (?/i)ie{i,...,A/} and denote by (pi)ie{i,...,A^} associated 
weights. Replace then the Gaussian distribution in (3.6) by this law. For a 
given X £Ck, < k < N , the expectations appearing in the induction scheme 

(3.6) then write as computable finite sums. 

Quantization principle. Generally speaking, for a given random variable 
A G f]p^iLP{F), the quantization procedure consists in replacing A by its 

projection on a finite grid A(M) = {(yi)ie{i....,J\/}} C M'^, M G N*. In order to 
measure the error associated to the grid A(M), we introduce the so-called 
"p-distortion" : Da,p{MM)) = ||A - Ga(m)(A)||lp(p), P > 1, where Ga(m) 
denotes the projection mapping on A(M). We refer to the monograph of 
Graf and Luschgy [11] for details. 

Optimal grids. The crucial step therefore lies in the choice of the grid. 
The Bucklew-Wise theorem (see Theorem 6.2, Chapter II in [11] for details) 
then gives, for A*(M) achieving the minimum in the p-distortion, 

(3.7) MP^'^Dl p{A*{M)) C{p,d) as +0O, 

where C{p,d) is a constant depending on p,d and the variable at hand. 

Various algorithms are available to compute an optimal grid A*(M), see, 
for instance, [2]. We also recall that, for d> 1, the optimal grid is not unique. 

Up to a rescaling, the basic object associated to Brownian increments 
is a d-dimensional standard normal random variable. Hence, we assume in 
the following that a grid A(M) for A~AA(0,Irf), as well as the associated 
weights (pi)ig{i,...,M}) a,re given and "perfectly" computed. 

Quantized algorithm. We are now in position to introduce a more tractable 
induction principle. Set to this end, for all k G {0, ...,N— 1}, g{AB^) = 
h^^'^G^M){h~^''^^B^)- Note from the electronic version [9] that, w.l.o.g., 
for every p > 1, there exists a constant Cquantiz (p, t^) such that 

(3.8) nai^B^) - Ai?'=|P]VP < CQ,antiz(p,rf)/i'/'M-V'^. 
Turn now (3.4) and (3.6) into 

(3.9) T{tk,x) = b{x, u{tk+i,x),v{tk+i,x))h + a{x, u{tk+i,x))g{AB'') 
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and 



(3.10) 



u{tk,x) = K[uitk+i,Uk+i{x + T{tk,x)))] 
+ hf{x,u{tk+i,x),v{tk+i,x)), 
v{tk,x) = h~^E[u{tk+i,Uk+i{x + r {tk,x)))g{AB% 



To sum up our strategy, the use of predictors allows us to recover a kind of 
standard dynamic programming principle. The quantization gives an easy, 
cheap and computable algorithm. 

3.3. Algorithm. For technical reasons detailed in Section 9, we consider 
for the convergence analysis a slightly different version of the above algo- 
rithm. Namely, we need to change, at a given time t^, the discretization of 
b and / and, in particular, to replace v{tk+i,-) by a new predictor. Con- 
cerning the driver of the BSDE, we replace f{x,u(tk+i,x),v(tk+i,x)) by 
f{x,u{tk+i,x),v{ti~,x)): the definition of v{tk,x) does not involve u(tk,x). 

The story is rather different for b. Indeed, the definition of v{tk,x) relies 
on the choice of the underlying transition. In particular, putting v{tk,x) in 
b as done in / would lead to an implicit scheme. 

Nevertheless, for a given intermediate predictor v{tk,-) of we can 

put 

T{tk,x) = b{x, u{tk+i,x),v{tk,x))h + a{x, u(tk+i,x))g{AB''). 

The whole difficulty is then hidden in the choice of v{tk,x). Our strategy 
consists in choosing v{tk,x) as the expectation of v{tk+i,-), with respect 
to the transition T^{tk,x) = a{x,u{tk+i,x))g{AB''). This transition differs 
from T{tk,x) in the drift b and leads to an explicit scheme. Namely, we set 



The predictor ^(tfc, •) appears as a "regularized" version of v{tk+i, •)• Thanks 
to a Gaussian change of variable, the laws of the underlying transitions 
rO(tfc,x) and T{tk , x) can be compared; see [9], Section 7.3, for details. 

Final algorithm. 



(3.11) 



■k,x) = E[v{tk+i,Uk+i{x + T 



{tk,xm. 



Algorithm 3.1. The final algorithm writes 



Vx G Ctv, n(T, x) = H{x), 
\fke{0,...,N -1}, VxGCfc, 



v{T,x) = V^H{x)a{x,H{x)) 
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= a{x,u{tk+i,x))g{AB''), 

= E[v{tk+i,Uk+iix + T\tk, x)))], 

= b{x, u{tk+i,x),v{tk,x))h + a{x, u(tk+i,x))g{AB''), 

= h-^E[u{tk+i,Uk+iix + T{tk, x)))giAB'')], 

= K[u(tk+i,Ilk+i{x + T{tk,x)))] + f{x, u{tk+i,x),v{tk,x))h. 



A discrete probabilistic representation. Following the link between {E) 
and {£), define, for xq G Cq, a Markov process on the grids {Ck)o<k<N ac- 
cording to the transitions {T{tk,x))ke{o,...,N-i},xeCt:i 

(3.12) Xo = xo, yke{0,...,N-l}, Xt^^^=Uk+iiXt^+T{tk,Xt^)). 
Referring to the connection between U and (y, VF) [see, e.g. (2.1)], put now 

(3.13) VfcG{0,...,iV}, Yt,=u{tk,Xt,), Zt^=vitk,Xt,). 

Note that Y and Z are correctly defined since Xt^. belongs to the grid Ck- 
The couple {Y,Z) appears as a discrete version of the couple (V,M^) in (E). 
More precisely, one can prove the following discrete Feynman-Kac formula: 
VO<A;<iV-l, 



(3.14) Yt, 



■■E 



N 



H{XtJ + h f{Xt^_„uiti,Xt^_,),Zt^_,) 

i=k+l 



Note anyhow that the process Z does not appear as the martingale part of 
the process Y . However, thanks to the martingale representation theorem, 
there exists a progressively measurable process Z, with finite moment of 
order two, such that 



(3.15) 



JL ftN_ 

yt^ + hY.f{Xt,^,MU,Xt^^,),Zt,_,)=Yo + / ZsdBs. 



Of course, the process Z does not match exactly the process Z. However, for 
a given k E {0, . . . , — 1}, it is readily seen from the above expression that 
the best .7^^^. -measurable approximation of {Zs)se[tf.,tk+i] in L'^{[tk,tk+i] x Q, 
ds(iS)dF) is given by h^^'E[Yt^,^^AB''\Ttk]- Up to the quantization procedure, 
this term coincides with v{tk,Xt^). In other words, the processes Z and Z 
may be considered as close. 

3.4. Choice of the grids. Because of the strong coupling, little is a priori 
known on the behavior of the paths of the forward process. Hence, we cannot 
compute a kind of optimal grid for X. The most natural choice turns out to 
be the one of Cartesian grids. 
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Unbounded Cartesian grids. Two different choices of grids are conceiv- 
able. First, we can treat the case of infinite Cartesian grids: V fc E {0, . . . , N}, 
Ck = Coo, Coo = where 6 > denotes a spatial discretization 

parameter. In this case, the projection mapping writes Vrc E R'^, 1100(2;) = 
J2yeCoo [yU.j=i M-S/2,5/2[ixj - Vj)]- In other words, for every j E {1, . . . , d}, 
the coordinate j of 1100(2;) is given by {Iloo{x))j = 6[6~^Xj + 1/2J . 

This choice actually simplifies the convergence analysis and allows a di- 
rect comparison with the results from the existing literature; see [10]. Note, 
however, that it does not provide a fully implementable scheme since the set 
Coo is infinite. 

Truncated grids. Several truncation procedures may be considered, but 
all need to take into account the specific geometry of a nondegenerate diffu- 
sion, or, more simply, of the Brownian motion. Set, for example, for a given 
R>0, and for alH E {0, . . . , N}, Ci = Coo n Aj, where 

Ai = {xeR'^,'il<j<d, 

(3.16) 

-5[{R + pm))6~^\ - 5/2 < xj <6[{R + pi'{ti))5-^\ + 5/2}, 

where ^l^{t) = flj^^g}' V ^ [0> l/^); is meant to take into account the Holder 
regularity of the Brownian path. The larger is rj, the smaller is the number 
of points involved in the discretization procedure. However, since the proof 
of the convergence of the algorithm is far from being trivial, we restrict our 
analysis to the case rj = 0. 

Note also that the particular choice of the bounds in the definition of 
Aj ensures that for all x E M'^, Hoo(a;) E Cj 44^ x E Aj. Hence, for every i E 
{0,...,N},Ui writes 

VO<i<iV, VxEA,, Ui{x) = Q{R + p,Uoo{x)) = Uoo{x), 

(3.17) Vl<i<iV, Vx^ A„ Ui{x) = Q{R + p,Uoo{x)), 

Vx^Ao, no(x) = Q(fl,noo(x)), 

where, for a given {r,y) E M.~^* x M'^, Q{r,y) denotes the orthogonal projec- 
tion of ?/ on the hypercube [— (^[r^""*^] , 5[r(5~"'^J]'^ : Q(r, y) = {{yiV {—5[r5~^\ )) A 
{5[r5~^ \))i<ci<d- Note finally that R is fixed by the reader once for all in 
function of the set on which u has to be approximated at the initial time. At 
the opposite, p appears as a discretization parameter chosen by the reader 
in function of the required precision and of the affordable complexity for 
Algorithm 3.1. 

4. Convergence results. This section is devoted to the convergence anal- 
ysis of u to u. As stated in the following theorem, which is the main result 
of the paper, five different types of errors can be distinguished: 
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Theorem 4.1. Let p > 2. There exist two constants f and ^, only 
depending onp, T and on known parameters deriving from Assumption (A), 
such that, for h < c^ 8'^ <h, M~'^l'^ < h and p>l, 

sup \u{0,x) - u{0,x)\^ <C> ^f^(global), 

with (global) = £^{time) + <S^ (space) +f^(trunc) +<S^(quantiz) +<5^ (gradient, 
p) and <f (time) = h^^"^, <5(space) = h~^5, f (trunc) = R/{R+p), <5(quantiz) = 
^_i/2^-i/d^ ^(gradient,p) = /jP/2+rf/4-i/2^j-p/d^-p-d/2_ 

Remark 4.1. The FBSDE counterpart of Theorem 4.1 is given in Sec- 
tion 4.3: see Theorems 4.2 and 4.3. 

4.1. Classification of errors. We now detail the meaning of the different 
errors appearing in Theorem 4.1: 

Temporal discretization error £^(time). The 1/2 exponent appearing in 
the definition of £^(time) corresponds to the Holder regularity of u and V^u 
in time and to the L^(P) 1/2-Holder property of the Brownian increments. 

Spatial discretization error f (space). This quantity highly depends on 
the ratio between the spatial and the temporal steps. This connection be- 
tween 6 and h can be explained as follows: the drift part of the transitions 
{T{tk, ■))o<k<N is of order h and the diffusive one is of order /i^/^. Thus, 
to take into account the influence of the drift at the local level, the spatial 
discretization parameter must be smaller than h. In other words, 6h~^ must 
be small. 

Quantization error iS(quantiz). This error depends on the ratio between 
the distortion and the temporal step. The quantity <?(quantiz) represents the 
typical bound between v{tk,Xt,,) and the best measurable approxima- 
tion of the process {Zs)se[t^.,t^.+l]l that is, between v{tk,Xtj,) and h~^'E[u{tk+i, 
I\.k+i{Xt^ +T{tk,Xt^)))/S.B^\Tn,]- Note, indeed, that the distance between 
AS^ and g{/^B^) is of order h^l'^M-'^/'^ , see (3.8). Since the underlying 
expectation is divided by /i, this leads to a term in h"^/'^ M"^/'^ . 

Truncation error iS(trunc). As written in Theorem 4.1, it depends on 
R and p, where R denotes the radius of the initial grid Co and R + p the 
radius of the grids {Ck)i<k<N ■ If P tends to -|-cxd, that is, if the grids are not 
truncated, this error term reduces to zero. 

Generally speaking, if (trunc) appears as the Bienayme-Chebyshev esti- 
mate of the probability that the approximating process X stays inside the 
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grids {Ck)o<k<N ■ The lack of relevant estimates of the discretized version of 
the drift b (recall that the function b is not bounded) and, more specially, of 
the discretized gradient v, explains the reason why the Bienayme-Chebyshev 
estimate applies in this framework and not better ones (as the Bernstein in- 
equality). We also recall that the unboundedness of the coefficients is the 
most common case in the applications, see, for example. Section 5.2. 

Gradient error i5(gradient,p). This extra error is generated by the lack 
of estimates of the discretized gradient v. This term follows from the specific 
choice of the predictor v made in Section 3.3 and appears in the second step 
of the proof of Theorem 4.1; see, more precisely, Sections 7.1 and 7.3. 

The convergence of f (gradient, p) toward relies on the term hp/"^ M~^/'^ 
6~P, M being chosen large enough and p as large as necessary. In short, this 
reduced form represents the probability that the distance between the Gaus- 
sian increment and its quantization exceeds the spatial step 6. Note, indeed, 
from (3.8) that, for every p>2, F{\AB'' - g{AB'')\ > 6} < Cq^^ntUp, d)hP/^ 
M~pI<^5~p. Thus, the error term (gradient, p) depends on the ratio between 
the spatial discretization step and the quantization distortion of the under- 
lying Gaussian increments. 

The above probability appears in the control of the distance between the 
predictor v and the true gradient v. In this frame, the strategy consists 
in writing the predictor v as an expectation with respect to the Gaussian 
kernel and not to its quantized version. Generally speaking, this strategy 
holds when the quantized transition T{tk,x) and its Gaussian counterpart 
belong to the same cell of the spatial grid, that is, when the distance between 
the Brownian increment and the quantized one is of the same order as the 
length of a given cell. Since the spatial grid step is given by 6, we then 
need to control the probability that the difference between the increments 
exceeds S. 

Of course, when b does not depend on z, there is no reason to define v. 
In such a case, i5 (gradient, p) reduces to 0. 

4.2. Comments on the rate of convergence. 

Error in function of h. To detail in a more explicit way the rate of 
convergence given by Theorem 4.1, we give an example in which p {p < +oo), 
5 and M are expressed as powers of h. Assume, indeed, that p, 5 and M are 
chosen in the following way: p = Rh'^/'^, d = h^+"< , M''^!'^ = h^+'^ , 7, /3 > 0. 

In such (gradient, p) = exp[ln(/i)[p(/3/2 — 7) — {d/2 + 1 +jd)/2]]. 

To ensure the convergence of the algorithm, we then need to choose 



p{(3/2 - 7) - {d/2 + 1 + -fd)/2 > 



/3>27 + (l/p)(d/2 + l + 7(i). 
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Put finally /? = 27 + {d/2 + 1 + ^d) /p + rj, r] > 0. The rate of convergence of 
the fully implementable algorithm is given by sup^^^^ |u(0, x) — n(0, x)p < 

Taking 7 = 1/2 and 77 = 1/p then yields sup^gj-^j \u{0,x) — n(0, x)p < 
C4 ih. In particular, for p large enough, the exponent f3 is close to 1 and the 
number M of points needed to quantify the Brownian increments is close to 
h~'^. Here is the limit of the method: for a large d and a small h, we need a 
rather large number of points for the Gaussian quantization. Recall anyhow 
that the Gaussian grids are computed once for all. Thus, the numerical effort 
to get sharp quantization grids can be made apart from our algorithm. 

Estimates of V^^u. The reader might wonder about the estimate of the 
gradient of u. Note in this framework that two strategies are conceivable. 

First, the probabilistic counterpart of Theorem 4.1 given in Section 4.3 
provides an estimate of the distance between v and the gradient of the 
true solution. Note, however, that the underlying norm is taken with 
respect to the distribution of the discrete process X [cf. (3.12)]. 

To get a joint estimate of the solution and of its gradient with respect to 
the supremum norm, the reader can apply the following strategy: differenti- 
ate if possible the PDE (£) and apply, once again if possible, Algorithm 3.1 
to {u, Vxu), seen as the solution of a system of parabolic quasi-linear PDEs. 
Such a strategy is applied in Section 5 to the solution of the porous media 
equation and to its gradient. Note that this approach coincides with the one 
followed by Douglas, Ma and Protter [10]. 

4.3. Estimates of the discrete processes. We now translate Theorem 4.1 
in a more probabilistic way. Recall indeed that, in several situations (e.g., 
in financial mathematics), the knowledge of the triple (U, V, W) is as crucial 
as the knowledge of the couple (u, V^u). 

We then prove that {X,Y, Z) and {U, V, W) get closer in a suitable sense 
as h, 6, and vanish. Note, however, that we are not able to prove 

that the distance between {X,Y,Z) and {U,V,W) over the whole inter- 
val [0,r] tends to zero. Indeed, since the projections (nj)o<i<Ar map every 
point outside the sets (Ai)o<j<Ar onto the boundaries of (Cj)o<i<Ar [see, e.g., 
(3.17)], we do not control efficiently the transition of the process X after the 
first hitting time of the boundaries of the grids hy X. It is then well under- 
stood that we have to stop the triple {X, Y, Z) at this first hitting time. Put 
to this end 

roo = inf{(tfc)i<fc<^,Xt,_^ +T{tk^i,Xt,_,)^Ak}, inf(0) = +cx). 

(4.1) 

First, as a bypass product of the proof of Theorem 4.1, the function v pro- 
vides an approximation of v in the following sense: 
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Theorem 4.2. Let p > 2. Then, there exist two constants c^ 2 ^'^'^ 
2, only depending onp and on known parameters deriving from Assump- 
tion (A), such that, for h < 2^ ^"^ < M"'^/'^ < h and p>l, 



i=Q 



Moreover, the triple {X,Y, Z) stopped at time Too satisfies the following: 

Theorem 4.3. Let p > 2. Then, there exist two constants c^ ^ and 
^, only depending onp and on known parameters deriving from Assump- 
tion (A), such that, for h,5,M as in the previous theorem 



E 



sup \Xt^Ar„ 

■ie{o,...,N} 



U, 



+ E 



sup \Yt^Ar^ 
i€{0,...,N} 



N-1 



i=0 

5. Numerical examples. In order to compare the results we obtain with 
our algorithm to a reference value, we choose equations that admit an ex- 
plicit solution. In this frame, we focus on three examples: the one-dimensional 
Burgers equation, the deterministic KPZ equation in dimension two and the 
one-dimensional porous media equation. 

5.1. One-dimensional Burgers equation. Consider first the backward Burg- 
ers equation: 



(5.1) 



dtu{t, x) — (udxu) {t,x) -\- —d'^ .j,u{t, x) = 0, 

[0,T[xM,e>0 



u{T,x) = H{x), x£R,HeCf 



2+ai 



),aG]0,l[. 



Using a nonlinear transformation, one can derive an explicit expression of 
the solution of (5.1). This is known as the Cole-Hopf factorization, see [23], 
Chapter IV, or [24], Chapter III, for details. The solution of (5.1) then writes 



(5.2) V(t,x) G [0,r] X 



u{t, x) 



E[H{x + eBT-t)(p{x + eBT-t)] 
E[0(a; + eBT-t)] 



where S is a standard Brownian motion and 
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From the explicit representation (5.2), we can derive numerically, using, 
for example, a Riemann sum, a Monte Carlo method or a quantized version 
of the expectation (5.2), a reference solution to test the algorithm. 

The reader may object that the Burgers equation is actually semi-linear 
and not quasi- linear. Actually, it depends on whether we consider the nonlin- 
ear term as a drift or as a second member. We describe below the algorithms 
associated to these two points of view, even if the coupled case is the only 
one to fulfill Assumption (A). 

Moreover, in the forward-backward representation of the Burgers equa- 
tion, the estimation procedure of the gradient is not necessary to compute 
the approximate solution u. Numerically, this case turns out to be the most 
robust. Finally, in both cases, the intermediate predictor v is useless: in the 
coupled case, the drift of the diffusion U reduces to V (and thus does not 
depend on W), and in the decoupled one, the drift vanishes. 

5.1.1. Explicit expression of the algorithms. For a given final condition 
H G Cft^+"(M), Q G ]0, 1[, we write the following: 

Algorithm 5.1 (Coupled case). 

VxgCat, u{T,x) = H{x), 

yk£{0,...,N -1}, VxGCfc, 

u{tk,x) = E[u{tk+i,nk+i{x - u{tk+i,x)h + eg{AB'')))], 

7;(tfc,x) = /i-iE[t2(tfc+i,nfc+i(x-n(tfc+i,rE)/i + e5(AB^')))5(A5'=)]. 

Algorithm 5.2 (Pure backward case). 

VxgCat, u(T,x) = H{x), 

yk£{0,...,N-l}, VxGCfc, 

u{tk,x) = E[u{tk+i,'nk+i{x + eg{AB'')))] - he~^u{tk+i,x)v{tk,x), 
vitk,x) = h~^E[u{tk+i,Uk+iix + egiAB^)))g{AB'')]. 

5.1.2. Numerical results. In order to avoid first to truncate the grids, 
we choose a periodic initial solution. Put to this end H{x) = sin(27rx) and 
derive from (5.2) that u is 1-periodic. This allows to define u{tk, ■) on Coo by 
setting y X £ Coo, u{tk,x) = u{tk,x— \ x\). Hence, we can set = Coo for k G 
{0,...,Af- 1}. For T= 1,5= 10-3, /i = 0.01, M = 160, e = 0.15, we present 
below the results of the previous algorithms. The explicit solution given 
by (5.2) is approximated by quantization techniques with a 500 points grid. 
We plot below some profiles of the reference value for various discretization 
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times, as well as the pointwise absolute error between this reference solution 
and the approximations obtained with our algorithms. See Figure 1. 

On the profiles of the explicit solution, the abscises of the peaks of the 
initial sinusoidal wave are going closer to each other up to a given time tQ. 
This is a typical shocking wave behavior. Because of the viscosity, that is, e 
is nonzero, there is no shock and the amplitude of the wave decays when t 
goes to zero. 

From a numerical point of view, the coupled case provides several ad- 
vantages. First, the convergence of Algorithm 5.1 does not rely on the dis- 
cretization procedure of the gradient. In short, there is no reason to update 
the gradient in order to obtain the approximate solution with the first algo- 
rithm. The computation of v just provides in this case an estimate of the 
gradient. At the opposite, this computation is necessary in Algorithm 5.2. 

Moreover, since the coefficient f{y, z) = £~^yz is not globally Lipschitz in 
the pure backward case, it is then another story to establish the convergence 
of Algorithm 5.2. 

These theoretical remarks are confirmed by the pictures below. Even 
though Algorithm 5.2 does not behave too poorly, it is still less precise 
than Algorithm 5.1. The factor between the absolute pointwise errors of the 
two algorithms is approximately 5. 




Fig. 1. 
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Truncation error. We now illustrate the effects of truncation and deal 
with a nonperiodic final data. Namely, we take H{x) = exp(— T = 1, 
h = 0.02, p = 3, 6 = p/500, M = 250. The reference value, see profiles below, 
is computed from the Cole-Hopf explicit solution by quantization techniques 
with a 500 points grid. We run Algorithm 3.1 with the previous parameters 
to obtain Figure 2. 

Choose now R = l: the expected truncated error i5(trunc) is given by 0.25, 
whereas the absolute point-wise error between both solutions is bounded 
by 0.05 on [—1,1]. This emphasizes the difficulty to control the truncation 
procedure in our algorithm. There are two possible arguments to explain 
this difference between 0.25 and 0.05. First, as explained in Section 9.1, 
our way to estimate iS(trunc) is suitable for unbounded drifts b and, more 
particulary, for drifts depending on the gradient. In our case, the drift is 
bounded (since the solution is bounded by 1), and most relevant estimates 
could apply. Second, the fast decay of the final condition H may explain the 
low influence of distant points on the values of the solution on [—1, 1]. 

Note also that the relative error is close to 0.1 on [—1,1]. A possible 
strategy to decrease it would consist in refining the spatial mesh. 

We also feel that the choice of the rough projection mappings (nfc)o<fe<7v 
deeply affects the global error. To investigate more precisely their influence. 




Fig. 2. 
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Fig. 3. 



we replace them by standard linear interpolation procedures (which are de- 
fined in an obvious way since the underlying space is one dimensional). In 
short, this permits to extend continuously the approximated solution u to 
the whole space. With the same parameters as above, we then get Figure 3. 

Numerically, the interpolation can thus be really relevant to improve the 
convergence (see Section 9.2 for further details and explanations on this 
point). To obtain the same precision without interpolation, we need to re- 
fine significantly the parameters (taking, e.g., S = 2 x 10~^). Let us finally 
mention that the results obtained with the coupled representation and the 
linear interpolation are still more accurate than with the backward one. 

5.2. Deterministic KPZ equation. In this subsection we focus on the so- 
called "deterministic KPZ" equation (see, e.g., [12] and [24], Chapter I, for 
a physical interpretation): 

dtu{t,x) + ]^iT{(ja*V\^u{t,x)) + '^\a*V :,u{t,x)\^ = 0, 

(5.3) (t,x) G [0,r[xM'^, 

u{T,x)=H{x), xGM'^, 

where v G M+* is a given parameter and a a given constant matrix such that 
aa* is positive definite. 

Such an equation admits too a "Cole-Hopf explicit solution" (see again 
[12]) that writes u{t,x) = z>~Mog(E[exp(z^ff(x + oBT-t))])- We then apply 
Algorithm 3.1 to (5.3) seen as a true quasi-linear equation (so-called "cou- 
pled case" in the former subsection). 

Concerning the initial condition, we choose H{x) = nf=i sin(27r3;j). By 
construction, we have Vx G W^, V/c G U^, u{t,x + k) = u{t,x). Since the so- 
lution is periodic, u can be defined on the whole grid Coo (see also Sec- 
tion 5.1.2). We now present the results for d = 2, v = 0.3, T = 0.5, h = 0.02, 
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5 = 5x 10"^, M = 160 and aa* = ^) with 6 = 0.8. The reference value 
and its gradient have been derived from the exphcit writing of u using quan- 
tization techniques with a 500 points grid. At t = 0, one has Figure 4. 

The relative error between the approximate and true solutions is at most 0.25. 
The explanation seems rather simple: the explicit solution quickly decays as 
time decreases. Anyway, we feel that our algorithm manages to catch this 
specific decreasing phenomenon. 

Let us also mention that the last picture represents the pointwise dif- 
ference of the true and approximated gradients, but the control given by 
Theorem 4.2 just holds in L^. 




Fig. 4. 
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5.3. Porous media equation. To conclude this section, we focus on the 
equation (this example is taken from [16]) 

dtu{t,x) + {udl,j.u){t,x) + {dxu)'^{t,x) + u^{t,x) = 0, 

(5.4) (t,x) G]0,T] X M, 

u(r,x) =T-^-cos2f^Y L = 2V2tt, 



3 \L 

which admits the L-periodic explicit solution 



, , _i 4 ^fT^x\ 
u{t,x) =t 3 cos ( — 1. 



Note that (5.4) does not fulfill Assumption (A). In the sequel, we choose 
without any rigorous justifications to apply Algorithm 3.1 on [T/2,T] (note, 
however, for a rough explanation that the quadratic growth of the coefficients 
ensures that Theorem 4.1 holds on a suitable interval [t, T] , for t close enough 
to T and, in the same way, Theorem 2.1 applies away from 0). 

Nevertheless, as explained in Section 4.2, this procedure just provides 
an L^-estimate of V^m. In this framework, we have decided to apply the 
so-called "differentiated" approach, described in Section 4.2, to obtain a 
pointwise estimate of VxU (see Algorithm 5.3 below). 

Note finally from the periodicity of u that u can be defined on the whole 
grid Coo as in the previous example (see also Section 5.1.2). 

Algorithm 5.3 (Differentiated algorithm). 
\/x£Cn, u{T,x)=T gCO^^l"^)' 

«-.(T,.-) = r-(--cos(-)sm(-)), 

VA;E{0,...,iV- 1}, VxeCfc, 



u{tk,x) = E[u{tk+i,Ilk+i{x + w{tk+i,x)h + ^^2u{tk+l,x)g{AB )))] 
+ hu{tk+i,xf, 



w{tk,x) = E[w{tk+i,Ilk+i{x + 3w(tk+i,x)h + ^ 2u(tk+i,x)g{AB^)))] 
+ 2hu{tk+i,x)w(tk+i,x). 

For T = l,h = 0.02, 6 = L/500, M = 160, we present below the results 
obtained first with Algorithm 3.1 (the approximation of the gradient with 
this algorithm is undefined at x = ibL/2 and we thus arbitrarily set it to zero) 
and then with Algorithm 5.3. See Figure 5 for the results on [— L/2,L/2]. 
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We first observe that the approximated solutions obtained with the two 
algorithms are not significantly different. The main advantage of the differ- 
entiated algorithm is, as expected, for the pointwise approximation of the 
gradient. Indeed, in that case there is a factor 4 between the absolute point- 
wise errors associated to the two methods. Let us also indicate that both 
methods present some "singularity" in the neighborhood of x = ibL/2 for 
the estimation of the gradient. This could be expected for Algorithm 3.1 
since the estimation of the gradient is obtained by dividing v by \/2^ that 
goes to when x ibL/2. It is a bit more surprising for Algorithm 5.3. 



* 1 I p 4 » 




r 1 4 



Fig. 5. 
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6. Proof. First step: a priori controls. In this section we give various a 
priori estimates of the couple (Y, Z) introduced in (3.13) and of the approxi- 
mate diffusion X defined in (3.12). These controls are necessary to establish 
Theorems 4.1, 4.2 and 4.3. 

About constants. In the following, we keep the same notation C,C^,c,q 
(or C', C^, c^) for all finite, nonnegative constants which appear in our com- 
putations: they may depend on known parameters deriving from Assump- 
tion (A), on T and on p, but not on any of the discretization parameters. The 
index •& in the previous notation refers to the numbering of the Proposition, 
Lemma, Theorem, . . . where the constant appears. 

Conditions on parameters. We assume that the conditions of Theorem 
4.1 on /i, 5, M, p and p are fulfilled. 

6.1. Discrete backward equation and a priori estimates. 

Discrete Feynman-Kac formula. By iteration of the dynamic program- 
ming principle in Algorithm 3.1, it is plain to prove the discrete Feynman- 
Kac formula (3.14). 

Both formulae (3.14) and (3.15) [representation of Itjv + ^J2iLi fi^u^i, 
u{ti,Xti_-^), Zti_i^) through the martingale representation theorem] permit 
to apply the BSDE machinery to our frame. However, as well known in 
the literature devoted to SDEs (or, equivalently, to PDEs), several a priori 
estimates of the solution are necessary to apply this strategy. 



Proposition 6.1. There exists a constant Cg j s.t. 



sup 

1=0,..., AT 



sup \u{ti,x)\^ 



<C 



6.1- 



Proposition 6.2. There exists a constant Cg 2 



E 



iZJ'^ds 



+ hJ2E[\Ztf] 



1=0 



h sup 

i=0,...,N 



sup \v{ti,x)\" 



+ h sup 

i=0,...,N-l 



sup \ v{ti,x)\" 



<C 



6.2- 



The distance between Z and Z can be estimated as follows: 



Lemma 6.3. There exists a constant Cq ^ s.t., for k G {1, . . . , A^}, 



E 



hZt, 



E 



tfe 



Z, ds 



tk-i 



< 



C(j ^h"^ S'^ (quantiz) . 
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6.2. Approximate diffusion. 

Jumps of the discrete forward process. Start first with the following: 

Lemma 6.4. For a given k £ {0, . . . ,N — 1}, the norm of the increment 
Xj^^^ — Xtf. is always bounded by |T(tfc,X(^)| +S. In particular, there exists 
a constant Cg ^ such that 

E[\Xt^^^-Xtf\j^,^]<Ce,^[h + s']. 

Proof. Since e C^, onehasUoo{Xt^^+T{tk, Xt^^)) = Xt^+Uoo{T{tk, 
Xt,^)) (invariance by translation of the grid Coo). Moreover, for every y in 
the image of the projection Q{R + p, •) and for every z G M^, the distance 
\Q{R + p,y + z) — y\ is bounded by \z\. Hence, 

^ ^ \Xt,^, -Xt,\ = \ Q{R + p, Xt, + noo(T(tfc, J)) - Xt^ I 
(6.1) 

<\Ii^{T{tk,Xt,))\<\'T{tk,Xt,)\ + 5. 

Thanks to Propositions 6.1 and 6.2, we are able to bound the drift b appear- 
ing in the transition. Since E[|5(AS^)|2] < C/i, from Assumption (A) and 
Proposition 6.1, we also control the martingale part of the transition. This 
completes the proof. □ 

Extension of the discrete diffusion" . For the proof, we need to extend 
the definition of X to the whole set [0, T] . Put, for allA;G{0,...,A^ — 1} and 

t G [tk,tk+i[, 

Xt = Xt, + b{Xt, , u{tk+i ,Xt,), v{tk ,Xt,)){t- tk) 

(6.2) 

+ a{Xt„u{tk+i,Xt,))[Bt-Bt,]. 
From Proposition 6.2, we get the following: 

Lemma 6.5. There exists a constant Cq 5 s.t., for every k ^ {0, . . . , N — 
1}, 

Vt G [tk,tk+i[, n\Xt- XtflJ^t,] < Cg^sh. 

The extended process {Xt)o<t<T is discontinuous at times {tk)i<k<N- At 
a given time t^, 1 < A; < A^, the size of the jump performed by the process 
depends on the quantization error and on the spatial projection error. The 
first error is easily controlled by the distortion. Concerning the second one, 
the projection error is close to the spatial step 6 when the grids are infinite. 
For truncated grids, the story is slightly different. In fact, as soon as the 
process stays inside {Ak)o<k<N , the projection error is close to the step 5 of 
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the interior mesh of the grid {Ck)o<k<N- At the opposite, outside (Afc)o<fc<Ar, 
the jump of the process may take large values. 

The time continuous extension of X remains close to the discrete version 

of X up to time Too- 

Lemma 6.6. There exists a constant Cq q such that 

N-l 

H'^{U+^<T^}\Xu+^ - < (space) +£:2(quantiz)). 

i=0 

Proof (Sketch). From (6.2), the difference Xt^^^ — writes 



X, 



Xt 



(6.3) 



[n,+i(Xi^ +r(t„Xij) - {Xu+T{t,,Xt^))] 

+ a{Xu , u{ti+i , XtMai^B') - /\B'] 
= Ei{i + l) + E2{i + l). 

Ei{i + 1) appears as a projection error and E2{i + 1) as a quantization one. 
It is readily seen that Ei{i + 1) is bounded by 5 on {tj+i < Too}- From (3.8), 
one also gets E[|^2(i + l)Pl^tJ < ChM-'^/'^. □ 

6.3. Sketches of the proofs of the a priori controls. 

Discrete BSDE. This section is devoted to the proof of Propositions 6.1, 
6.2 and Lemma 6.3. We first give a control of the norm between Ztf._^ 
and the conditional expectation of f^^ ^ Zg ds appearing in Lemma 6.3. This 
preliminary estimate permits to prove Proposition 6.1. We then derive the 
complete proofs of Proposition 6.2 and Lemma 6.3. 

Step one: preliminary control in Lemma 6.3. From (3.15), write, for a 
given ke{0,...,N -1}, 

Yt,^, + hf{Xt, , u{tk+i,Xt,),Zt,) = Yt, + Z, dBs. 

Multiply this identity by AB^, take the conditional expectation w.r.t. J-t^ 
and plug the definition of Zt,, [cf. (3.13)]: 

rtk 



(6.4) hZt 



E 



■fc+i 



Zo ds 



t-k 



E[Y,^^M^B')-^B')\^t,]- 



Referring to (3.8), there exists C s.t. 



(6.5) 



E 



hZ, 



-E 



Zs ds 



<ChM-'^l'^E\Y? 1. 



This preliminary estimate (6.5) is necessary to prove Proposition 6.1 from 
which we will derive E[l^^^^] < C, and thus complete the proof of Lemma 6.3. 
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Step two: proof of Proposition 6.1. To estimate the supremum norm of 
u over the grids Co, ■ ■ ■ ,Cn, we follow the basic strategy of the BSDE theory 
and, therefore, apply a discrete version of Ito's formula to the discrete BSDE 
formula given in (3.14)-(3.15). Such a formula can be found in [22], Chapter 
VII, Section 9. We obtain 

TV 

EIYtI^ = \Yo\^ + 2hY,n-fiXu-^,n{U,Xu_,),Zt^_,),Yt^_,) 

i=l 

(6.6) 



'7 12^5. 



TV -p 

+ h^J2^[fHXu,„uiti,Xt^_,),Zt^_J]+E [ |2„ 

Following standard computations in BSDE theory, it is plain to derive from 
(6.5) and (6.6): 

rp N-1 N 

(6.7) |u(0,xo)P+e/ \Zs\^ds + hY,MZtf]<C + ChJ2sup\u{ti,x)\'^. 

Jo j=o i=0 ^'^'^i 

There exists a constant c > such that, for h < c (recall indeed that h is 
small), the above inequality holds but with i = 1 instead of i = as initial 
condition in the r.h.s. of (6.7). As usual in BSDE theory, we can establish 
in a similar way that, for every initial condition (tfc,x), 1 < k < N , 

N 

yk£{0,...,N -1}, sup |n(tfc,x)p <C + C/i ^ sup |u(ti, x)p. 

A discrete version of Gronwall's lemma yields the result. 

Step three: proofs of Proposition 6.2 and Lemma 6.3. The L^-estimates 
of Z and Z in Proposition 6.2 follow from Proposition 6.1 and (6.7). More- 
over, as a consequence of Proposition 6.1 and the definitions of v and v, see 
Algorithm 3.1, we deduce the estimates of the supremum norms of v and v. 
Lemma 6.3 follows from (6.5) and Proposition 6.1. 

7. Proof. Second step: stability properties. This section focuses on the 
second step of the proof of Theorems 4.1, 4.2 and 4.3, and aims to establish 
more specifically a suitable intermediate inequality, close to usual stability 
properties of FBSDEs. 

Strategy. Recall first that two main strategies are conceivable in the the- 
oretical framework to establish classical stability theorems for FBSDEs. 

Denote to this end by {U', V , W) a solution of another FBSDE of type {E) 
with different coefficients. The associated PDF solution is just denoted by u' . 
In order to compare u' with u, the following approaches have been employed 
in the literature: 
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1. First, the recent induction principle given in [6] can be applied. In short, 
u and u' are compared on a neighborhood of the boundary T with clas- 
sical arguments of stochastic analysis and the estimate of the difference 
between these solutions is then extended by induction from the final 
bound T to the initial bound 0. The local estimates consist in studying 
the distance between U and U' and between (F, P^) and {V',W'). This 
strategy has been successfully applied to various contexts (see [6] for the 
solvability of FBSDEs and [8] for homogenization of quasilinear PDEs). 

2. A second approach follows the earlier Four Step Scheme of Ma, Protter 
and Yong [14] . Instead of studying the difference between U and U' and 
between (V^iVF) and {V',W'), the process {u{t,Ul))Q<_t<T is written with 
Ito's formula as the solution of a BSDE. This BSDE is then compared 
with the one satisfied by {V ,W'). In particular, these BSDEs are both 
written with respect to the same diffusion U' . Generally speaking, this 
strategy holds when u is smooth enough (e.g., if u satisfies Theorem 2.1). 
It is then more direct than the previous one. 

Under Assumption (A) we apply the second strategy and compare the pro- 
cess Y with the process {u{t,Xt))Q<t<Tf\T^ [see (6.2) for the definition of 
the extension of X] . 

7.1. Statements of the stability results. 

First stability property. Applying the usual FBSDE machinery, we are 
able to establish in Section 7.2 the following first inequality: 

Proposition 7.1. There exists a constant Cy ^ such that, for rj small 



enough, 



N 



\{u - u)(0,xo)P + C7>^E[|(^; - v){t,.,,Xt^_,)\\,^ 




< C'/j P{too < +00} + £:^(time) + .5^ (space) + <S^(quantiz) 



N 



(7.1) 



")(*i>^t,-l)l^l{t,-l<r^}] 



N 



+ r,-'hY,E[\{u-u){t,-uXt^_,)\\t,.,<r^}] 



N 



+ {v + h)hY,E[\{v-v){t,.uXt^-,)\'l{t, 




30 



F. DELARUE AND S. MENOZZI 



When the drift b does not depend on z, the last term of the r.h.s. does not 
appear. 

Estimates of the gradient increment. Assume for the moment that Propo- 
sition 7.1 holds. Note that the main problem then remains to estimate the 
last term in the r.h.s. of (7.1). Thanks to the specific choice of v in Section 
3.3, we are able to establish in Section 7.3 the following control: 

Proposition 7.2. There exists a constant Cy ^ such that, for k G {0, . . . , 
iV- 1}, on {tk<T^}, 

\{v - v){tk,Xti^)\ < C7_g[f(gradient,p) +<S(time) + h£{space) 

+ E[\iv-v){tk+i,Xt^,^,)f\J't,f\ 

Main stability theorem. From Propositions 7.1 and 7.2, we claim the 
following: 

Theorem 7.3. Proposition 7.1 holds with the last term in the r.h.s. of 

(7.1) replaced by 

N 

£\gmdient,p) + i7^ + h)hY,E[\{v-v)it„Xt^)\\,^_^^,^y]. 

i=i 

Application of Theorem 7.3 to the proof of Theorems 4.1, 4.2 and 4.3 is 
given in Section 8. 

7.2. Proof of Proposition 7.1. 

Starting point: time continuous backward processes. Following the sec- 
ond strategy and referring to the structure of the PDF set for notational 
convenience 

(7.2) ViG[0,r], Vt = u{t,Xt), Wt = VMt,Xt)a{Xt,Vt). 
Note, moreover, that the martingale part of {Vt)o<t<T is driven by 

(7.3) VtG[0,r[, Wt = VMt,Xt)a{X^^t),um) + h,X^^t))), 

where (j){t) = t^ for t^ < t < tk+i, k e {0, . . . , - 1}. From Theorem 2.1 
and Lemma 6.5, we derive the following a priori estimates of V,W for 

s G [tk-,tk+l[' 

(7.4) E[\Vs-Vt,\ + \Ws-Wt,m,] < Ch'/\ 
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Step one: Itd^s formula for V . Using Ito's formula and the PDE satisfied 
by u, we obtain, for i £ {0, . . . ,N — 1}, 

Vu^, -Vu =Vu^, -Vu+,- + p^\F{s,X,,Xt^,u{U+,,XtJ,v{U,XtJ) 

-F{s,Xs,Xs,Vs,Ws)]ds 

- f'^''' f{X,,Vs,Ws)ds+ r^'WsdBs, 
Jti Jti 

with F{s, X, X, y, z) = {'Vxu{s, x),b{x, y, z)) + (1/2) tr(a(x, ?/)V^ x'^i^^ ^))- 

Step two: difference of the processes. The strategy is well known: we aim 
to make the difference between V and Y and then to apply the usual BSDE 
machinery to estimate the distance between these processes. Hence, we claim 
from (3.15) 

^ti+i - ^ti+i - \Vti - Yt-] 

= Vti+i - Vti+i- 
fti+i 



+ f^^\F{s,Xs,Xt^,u{ti+i,XtJ,v{ti,XtJ)-F{s,Xs,Xs,Vs,Ws)]ds 
Jti 

- [/ {Xs,Vs,Ws)-f {Xt, ,u{U+i,XtJ,Zu)] ds 
Jti 

+ r^\Ws-Zs]dBs 
Jti 



= AEi+i{l) + AE,+i(2) + A^,+i(3) + AEi+i{4). 

The discrete ltd formula [see the derivation of (6.6)] and standard compu- 
tations yield 

(7.5) |Fo - Yof + iZ)(3) < EIFtat^ - Yr^^rJ^ + D{1) + D{2), 
with 

N 

D{i) ^ -T&Y.Wh^.<r^}Wi,-. - y^-.\H^ 

N N 

(7.6) Di2) ^J2nht,^,<r^}E]], Di3) = ^E[1{,^_^<,^}A£;,(4)2], 

Ej =AE,il) + AE,{2) + A^,(3), j G {1, . . . , iV}. 

Step three: standard BSDE techniques. Following the BSDE techniques, 
we have to upper bound D[\),D{2) [resp. lower bound D[2>)] by terms ap- 
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pearing in the r.h.s. (resp. l.h.s.) of (7.1). The fohowing lemmas whose proofs 
are postponed to the end of the subsection give the needed controls. 

Lemma 7.4. Denote by RHS(7.1) the r.h.s. of (7.1). Then, there exists 
a constant Cy^ such that, for rj g]0, 1], 



\D{l)\+D{2)<C 



RHS(7.i) 



N 



+h{i^+h)Y,mv-v){t,^i,xt^_,)\H{,^_^^^^}] 

Lemma 7.5. There exists a constant Cj ^>0 such that 

N 

Di3)>Cr^^^hJ2Emt^_^^^^^\{v-v){tj_,,Xt^_^)\^] 
— Cy 5(<?^(quantiz) + i5^(time)) 

N 

-Cj,5hY,Emt^_^^,^}\iu-u)it„Xt^^,)\\ 

Note to conclude the proof of Proposition 7.1 that Yt = Vt- Hence, from 
Theorem 2.1 and Proposition 6.1 (boundedness of u and u), E|yT/\T-_^ — 
YT^T^? < CF{toc <T}< CP{too < +00} . Choose finally t] smah enough to 
obtain inequality (7.1) from (7.5), (7.6), and Lemmas 7.4 and 7.5. This com- 
pletes, up to the proofs of Lemmas 7.4 and 7.5, the proof of Proposition 7.1. 

Proof of Lemma 7.4. Note from Theorem 2.1 that AEj{2) and AEj{3) 
may be seen as "Lipschitz" differences since the partial derivatives of u of 
order one and two in x are bounded. Recall also that Vs = u{s, Xs), Wtj_-^ = 
v{tj-i, Xt^_^) and = u(tj_i, J. From Theorem 2.1 (Holder regu- 

larity of u in t), (7.4) (regularity of V and W), Lemma 6.5 (control of the 
increments of X) and Young's inequality, it comes, for every r/ s]0, 1], 

\D{1)\ < CS^itime) 

N 

+ CEY^m,^_,^r^y\Vt,_,-Y^_,\\Vt, -Vt,-\] 
i=i 

N 

(7.7) +ChY,[v~'mu-u){tj^uXt^_,)\\t^_^^,^^] 
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N 

i=i 



}] 



It now remains to estimate the second term in the r.h.s. of (7.7). Note first 
that, for all j G {1, . . . ,N}, {tj-i < Too} = {tj < Too} U {tj = Too}. Hence, 
thanks to the boundedness of u and u (see Theorem 2.1 and Proposition 
6.1), to Lemma 6.6 (jumps of the process X) and to the global Lipschitz 
property of u (see Theorem 2.1), 

N 

^T.ihh-i<ro.}[\yt,.,-Yt^_,\\vt, -vt^-\]] 

N 

i=i 

(7.8) + CF{too < +00} 

N 

<ChEY,[l{t^^r^A{u-u)it,^^,Xt^_,)\^] 
i=i 

+ C(£:^(space) + .S^(quantiz)) + CF{too < +00}. 

Plug (7.8) in (7.7) to derive the required control for D{1). 

Turn to the estimation of D{2): apply again Lemma 6.6 to control AEj{l), 
for a given j S {1, . . . ,N}, and treat the "Lipschitz" differences as done to 
estimate D{1). □ 

Proof of Lemma 7.5. Write first 
hJ2^mt,^,<r^}\vitj-i,Xt^_,)-vitj.i,Xt^_,)\^] 



TV 



4tj-l<-rcx)} 



v{tj_i,Xt^_^)-^E 









2- 


/ Zg ds 















+ E 


l{ij-l<roo} 




+ E 


-'-{ij_l<Too} 





tj-l 



[Zs - Ws] ds 



[Ws-v{tj.i,Xt^_,)]ds\:Ft.^_, 



A{l) + A{2)+A{2,). 



34 F. DELARUE AND S. MENOZZI 

From Lemma 6.3 (distance between Z and Z), we then derive A(l) < 
CiS^(quantiz). For the term ^(2), the Cauchy-Schwarz inequality yields 
A{2) < CD{3). Concerning A(3), we get 

N 



A{3)<CJ2^ 



X \V,j:u{s,Xs)a{Xt^_-^,u{tj,Xt._^)) 

- V^u{tj^i,Xt^_^)a{Xt^_^ , n(tj_i, ds 

Following the techniques employed in the previous proof, relying on the 
smoothness of the true solution (see Theorem 2.1) on the boundedness of 
the approximate solution, see Proposition 6.1, and on intermediate controls 
of the process X, see Lemma 6.5, we get 

TV 

2i 



^+J2^[Mt,.,<r^}\uitj,Xt^^,) - u{tj,Xt^_, 



^(3) < Ch 

The above estimates of A(l), ^4(2), ^4(3) complete the proof. □ 

7.3. Proof of Proposition 7.2 (difference of the gradients). 

Strategy. In Proposition 7.2, we aim to control the quantity \{v — v){tk, 
XtJ for tk<T^, with v{tk,XtJ = E[vitk+i,Uk+iiXt^ +T'^{tk,Xt^)))\J^t,] 
(see Algorithm 3.1). We first write v{tk,Xt^) in a similar way to study the 
difference {v — v){tk,Xt^). From Theorem 2.1 (regularity of u) and from the 
proof of Lemma 6.4 (with instead of T), we claim 

\n<tk+i,Iik+i{Xt,+T\tk,Xt,)))\rt,] - v{tk,Xt,)\<C[h^'^ + 5]. 

Hence, 

\{v-v){tk,Xt,)\ 

(7.9) 

<cn{^-v){tk+i,Tik+i{Xt,+r\tk,Xt,)))\\j't,] + c{h^/^ + 5). 

Proposition 7.2 directly follows from (7.9) and the next theorem: 

Theorem 7.6. There exists a constant Cy ^ such that on {tk < Tqo} 

E[\{v-v)itk+i,nk+i{Xt,+T'{tk,Xt,)))m,] 

< C7.g£:(gradient,p) + C7.gE[|(7; - v){tk+i, Xt,^,)f\Tt,f\ 



A PROBABILISTIC ALGORITHM FOR QUASLLINEAR PDES 



35 



The main difficulty to prove Theorem 7.6 hes in the lack of regularity 
of V. To overcome this point, note first that 

(7.10) n{v-v){tk+i,Iik+i{Xt,+T\tk,Xt^)))\\Ft,] 
and 

(7.11) E[|(i;-7;)(tfc+i,nfe+i(Xi,+T(tfc,XiJ))|Vtj'^' 

write as expectations of a given function with respect to two different kernels. 
We then aim to compare these underlying kernels. Recall that for a given x € 
Cfc, both T^{ti^,x) and T(tk,x) are, up to a quantization procedure, Gaussian 
random variables with same covariance matrices but different means. The 
strategy then consists in applying a Gaussian change of variable to pass from 
the first kernel to the second one. 

Step one: Proof of Theorem 7.6, exhibition of underlying kernels. We 
first write (7.10) with respect to the underlying kernel T^. Note in this 
frame, with the notation of Section 3.4, that, for every x G M'^, Hk+ii^) = 
Hk+i o noo(a;) since 1100(2;) G x £ A^. Thus, using the invariance by 

translation of Coo (see the proof of Lemma 6.4), (7.10) writes 

E[\iv-v){tk+i,U,+,{Xt^+T^itk,XtJ))mj 

(7.12) = [\iv-v)itk+uUk+i{Xt,+y))\ 

xF{n^iT'itk,Xt,))=y\J't,}]. 
In the same way, the square of (7.11) writes 

E[\{v-v){tk+i,Uk+i{Xt,+T{tk,Xt,)))f\^t,] 

(7.13) = Yl [\iv-v)itk+uUk+iiXt,+y))f 

x¥{Ii^{T{tk,Xt,)) = y\Tt,}]. 

Equations (7.12) and (7.13) provide relevant writings to estimate (7.10) and 
(7.11). Indeed, it is sufficient to bound for a given x G and a given y G Coo 
the probability ¥{Iloo{T^{tk,x)) = y} by (up to a multiplicative constant) 
the probability ¥{Jl^{T{tk,x)) = y] . We set 

S(tfc+i,x) =(j{x,u{tk+i,x)), fi{tk+i,x) = b{x,u{tk+i,x),v{tk,x)). 

Put ||i;(resp. ;u)||oo =supfce|o,...,Af}[suPa:GCfe |S(resp. ;u)(tA;,x)|]. From Assump- 
tion (A) and Propositions 6.1 and 6.2 (boundedness of u and h^^'^v), ||S||oo + 
h^^^M\oc<C. 
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Step two: Proof of Theorem 7.6, comparison of kernels. The proof of the 
following proposition relies on a standard Gaussian change of variable and 
rather tedious computations (the detailed proof is given in Section 7.3 in 
the electronic version [9]): 

Proposition 7.7. There exists a constant Cy y> such that, for every 
y G Coo, 

P{noo(TO(tfc,x)) = y} < ak{y) + miVk + ^^^^{noo{T{tk,x)) =y}), 
where 

akiy) = P{|S(tfc+i,x)(?(A5'=) - y\^ < 6/2, 

\g{AB'') - AB''\^> 6/{2\mU}, 

(3{y) ^ Cr^j5'"^h~''l''eM-Cj\h~^\y\% 

In the above expression, for all z G R'^, \z\^ = maxjgji^ \zi\. 

From Proposition 6.2, h^/'^v is bounded by a known constant. Denote by 
RHS(Xt^ , 7.13) the r.h.s. in (7.13) and by T{h, C) the sum EyeCoo exp[-C-^ x 
Owing to Proposition 7.7 and (7.12), we then get 

Y.[\{v- v) {tk+i , Hfe+i (x + y))\F{U^ (r° (tk ,x))= y}] 
< Ch-'/^F{\g{AB>') - AS^loo > V(2||S||oo)} 

+ C<5'^/2;j-^/4-l/2pl/2||^(^^fc) _ AS'^loo > (5/(4|| S ||oo )} 

(7.14) 

xr(/i,c) 

+ C[6'^h~'^/^T{h, C)]i/2[RHS(x, 7.13)]^/^ 

= r(i) + r(2) + r(3). 

Due to (3.8) and to the Bienayme-Chebyshev inequality, T(l) < C/i^/^~^/^ 
S-PM-P/'^. Thanks again to (3.8) (applied to the exponent 2p), T{2) < 
ChP/^-'^/'^~^/'^6~P+'^/^M~P/'^r{h,C)=C£{gT&dient,p){6h"^/^yr{h,C). 
Note now from (7.13) that 

r(3) = C[5'^/i-^/2p(^,C/2)]i/2E[|(f;-t;)(tfc+i,n,+i(x + r(tfc,x)))|Y/'. 

A standard comparison with a Gaussian integral yields {6h~^^'^)^T{h,C) < 
C". Plugging the different estimates of r(l), T{2) and T(3) in (7.14), we 
complete the proof of Theorem 7.6 [recall again that h~^5'^ is small to dom- 
inate T(l) by i?(gradient,p)]. 



A PROBABILISTIC ALGORITHM FOR QUASLLINEAR PDES 



37 



8. Proof. Third step: Gronwall's lemma. Here is the final step of the 
proof of Theorems 4.1, 4.2 and 4.3. 



8.1. Proof of Theorem 4.1, infinite grids. We first explain how to derive 
Theorem 4.1 from Theorem 7.3 when p = +oo, that is, when Too = +oo a.s. 
In this framework, the term i5'^(trunc) in (global) reduces to 0. The general 
case is detailed in the next subsection. For infinite grids, for r] and h small 
enough, we obtain from Theorem 7.3 and from the equality v{T, x) = v{T, x), 
for all X £ Cn, 



.1) 



\{u-u){0,xo)\^<C 



N 



(global) + sup \ {u — u){tj 



As usual in BSDE theory, the estimate (8.1) holds actually for any starting 
point {tf:,x), < k < N , X £ Ck- Hence, there is no difficulty to apply Gron- 
wall's lemma [at least for h small, as in (6.7)] and to complete the proof of 
Theorem 4.1 when p = +oo. 



8.2. Proof of Theorem 4.1, general case. We now turn to the case of trun- 
cated grids. Generally speaking, most of the approach given in the former 
subsection still applies in the general framework. It is, however, impossible 
to mimic word for word the arguments given above and we need to refine 
the previous Gronwall argument. 



First step. We first aim to get rid of the difference v — v appearing in 
the new r.h.s. in Theorem 7.3. Due to the functions (l{fj_i<roo})i=iv,^' 
the machinery used in the previous subsection does not apply. To overcome 
this difficulty, we write {tj-i < Too} = {tj < Too} U {tj = Too}- Indeed, since 
v(T,x) = v{T,x) for x £ Cn and h\v — is bounded (see Theorem 2.1 and 
Proposition 6.2), we obtain for rj and h small enough 



N 



\{u - u)(0,xo)P + C-'hY,E[\{v - ^)(t,_i,Xi^„J|'l{,^.„^<,^}] 



^.2) 



<C 



P{Too<+oo}+f2^global) 



N 

+ hY,E[\{u-u)itj,Xt^_,)fl{t,.,<r^}] 
j = l 

N 

+ hY,E[\{u-u)itj.l,Xt^_,)\H{t,^,^r^}] 
J=2 
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Even though we employed if ^(global) for notational convenience, we mention 
carefully that the origin of the term i5^(trunc) has not been explained yet. 
It is in the following lines. 

Second step. Note that (8.2) still holds if X starts at a given time ij, 
i G {0, . . . , A^}, from an .T^j. -measurable and square integrable random vari- 
able ^ with values in Cj. In such a case, (8.2) still holds with (0,xo) replaced 
by (ii)0) -^tj by Xl"^ and Tqo by [the superscript (tj,^) denotes the 
initial condition of the process X]. Due to the shift between tj^i and tj 
in the r.h.s., there is no possible choice of to recover the same form of 
terms in the left and right-hand sides. In particular, Gronwall's lemma does 
not apply at this stage of the proof. Note, in fact, that the same problem 
occurred in Section 8.1: this was the reason why the supremum was taken 
in the r.h.s. of (8.1). 

In the current frame, taking the supremum over x £Ci in (8.2) induces a 
new term, namely, sup^-g^i ^{t^'^ < +00} . Unfortunately, for x close to the 
boundary of the grid Cj, the underlying probability is far from being small. 
In particular, there is no hope to prove Theorem 4.1 in the case p < +00 
with the arguments used in Section 8.1. 

Strategy. Our strategy then consists in applying (8.2) to a suitable choice 
of ^. We then have to estimate the probability P{r^'^ < +00} for a random 
initial condition {ti,^,), ^ G L^(r2, JFj, ,P) with values in Cj. To this end, we 
need to control efficiently the tails of the variables (X**'^ t .i)i<j<N- Since 

the drift b is not bounded, a natural approach consists in estimating the 
norms of these variables. 

Lemma 8.1 (L^ control of the process X). For all /c G {0, . . . ,N}, put 
Tk = Too A tfc. Then, there exists a constant Cg ^ such that, for all i G 
{0,...,A^} one? ^ G L^(r2, ,P) with values in Ci, 

VA: G {i, . . . ,iV}, IE[|X*;'«|2] < C7_^,_^[E|^|2 + l + £:2(space) + £:2(gradient,p)]. 

Proof (Sketch). We remove the superscript (ij,^) in the writing of 
X. Then 

Xr,=^ + J2[r{t„Xt^)^,^^,^}] 
j=i 

+ E[(ni+i(^i. +T{t^,Xt.))-Xt^ -r(t„Xij)i|f^.^^<,^}] 
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j=i 

= i + S{l) + S{2) + S{?,). 

The term 5(2) corresponds to a standard projection error. Thus, E[|5(2)p] < 
5'^{k — i)^ < C<?^ (space). For 5(3), Lemma 6.4 and Young's inequahty yield 

j=i 

(8.4) 

fc-i 

< Ch^£\space) + C¥{Too < +00} + C E[|T(tj, .)|^]. 

j=i 

From Propositions 6.1 and 6.2, we can prove that E[|T(tj,Xj^. )|^] < C/i^. 
We finahy deduce {h being smah) E[|5(3)|2] < C[F{too < +00} + f^^^jj^g) _^ 
(space)]. 

Deal now with 5(1). Thanks to Propositions 6.1 and 6.2, we estimate 
the drift, and thanks to the independence of the Brownian increments, we 
bound the martingale part. From Assumption (A), there exists a constant 
C such that 



k-l 

E[\Sil)\'^]<Ch{k-i)' 



j=i 

Apply now Propositions 7.2 and 6.2, and derive that E[|5(l)p] < C[l + 
(gradient, p)]. □ 

Estimate of the probability of hitting the boundary. Thanks to the pre- 
vious lemma, we are now able to estimate the probability P{r^'^ < +00}, 
with (z,^) as in Lemma 8.1. Indeed, {r^'^ < +00} C {|X*^^|oo + 5> R + p}. 
Thanks to the Bienayme-Chebyshev inequality and to Lemma 8.1 (with 
k = N), we get 

P{t^'« < +00} 

(8.5) 

<C[{R + /3)~2e[|CP] + (space) + £:2(trunc) + (gradient, f))]. 
Plug now (8.5) into (8.2) to obtain 
E[|(n-n)(ti,Ol'] 

<C (i? + /9)-2E[|eP]+f^(global) 

(8.6) 
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N 



+ h E[\{u-u){tj,xt;i)\H 



j=i+l 



N 



+ h J2 m^^-u){tj-i,xt;i)\H 



{ij_l<T^'^} 



j=i+2 



A refined Gronwall argument. The key idea is to find by induction a 
sequence of constants Cj(l), Cj(2), i € {0,...,A^}, sucli that, for any ^ G 
L^(r2,^t. ,P) with values in Cj, 



Thanks to Lemma 8.1, we are able to build two sequences Cj(l) and Cj(2), 
z G {0, . . . , A^}, satisfying (8.7) and uniformly bounded by a constant C. 
Choosing i = and = G Cq, we then complete the proof of Theorem 
4.1. The explicit construction of Ct(l) and Cj(2), i G {0, ...,A^}, is given in 
the electronic version [see (8.12) in there]. 

8.3. Proofs of Theorems 4.2 and 4.3. We turn to the proof of Theorems 
4.2 and 4.3. The initial condition of the process X is given by Xq = xq, 
xq G Co, as in (3.12). 

Proof of Theorem 4.2. Prom inequalities (8.2) (deriving from the 
stability theorem), (8.5) (probability of hitting the boundary of the grids) 
and (8.7) [estimate of u — li, recall that Cj(l), Cj(2), j g{0, ...,A^}, are 
uniformly bounded]. Theorem 4.2 holds with t'(tj,Xt. )l|j^<;^^} instead of 
v{ti,Xt^). Since v is bounded (see Theorem 2.1) and since the probability of 
hitting the boundaries of the grids is controlled [see again (8.5)], we easily 
complete the proof. □ 

Proof of Theorem 4.3. It just remains to study the convergence of 
(Xfj. , i^fc, ■^tfe)o<tfc<TooAT toward the solution {U, V, W) of (E). Thanks to the 
Lipschitz properties of b and a, we first deduce by standard computations 
(see, e.g., the proof of Lemma 8.1) the analogue of Proposition 7.1. 

Proposition 8.2. There exists a constant Cg 2 fo'^ A; G {1, . . . , N}, 



(8.7) 



E[|(n-tx)(t,,Ol'] 

< Q(l)£:2(global) + c,{2){R + prMiW 



mxr, - u. 



2 



< Cg_2 IP{^oo < +00} + (global) 

(8.8) 
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+ hJ2[nMt,<r^}i\Xt, - Ut^\' + \iu-u)it, + l,Xt^)\' 
j=0 

+ \{v-v){t„Xt^)f)]]. 

Recall now from Proposition 7.2 (estimate of v — v), Theorem 4.2 (L^ 
estimate of v — v) and (8.5) (probability of hitting the boundary of the 
grids): 

k-l 
3=0 



<c 



(8.9) 



<5^(time) +<?^ (space) + (gradient, p) 



< C[f 2 (global) +P{too < +oo}] < C7£:2 (global). 

Apply now inequality (8.7) (estimate of -u — n) and (8.9) to (8.8) and deduce 
from Gronwall's lemma that sup;jg|o^...^Ar} IE|Xt-j, — ?7rfeP < C<?^(global). Fi- 
nally, according to Theorem 2.1, to Theorem 4.2 (L^ estimate oiv — v) and 
to (8.7), we deduce the following intermediate estimate: 



^.10) 



sup E[|X,, + 
fce{0,...,Ar} 



Tfc ^fc I 



+ hY, nZt, - m/l{t,<.^}] < C£:2(global). 

Applying Doob's inequality, we derive the same bound but with the supre- 
mum inside the expectation. It finally remains to prove the same result, but 
with iUtf^,Vt^,WtJo<k<N instead of iUrk,Vr^,Wt^l{t^:^^^})o<k<N■ Since the 
same arguments apply for V and W, we just detail the case of U. Note indeed 
that, for every fc G {0, . . . , N}, 

sup \Xr^^ - Ut,^ \^ <C sup - Ur^ \^ + C SUp | Ur^ - Ut^ f. 



kG{0,...,N} ke{0,...,N} ke{0,...,N} 

Thanks to the Burkholder, Davis and Gundy inequalities, it is readily seen 
that 



E 



sup \Ur,-UtJ 
.ke{0,...,N} 



< CE[{tN - roo)l{.^<+oo}] < CrP{roo < +oo}. 



Referring to (8.5), we easily complete the proof of Theorem 4.3. □ 
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9. Conclusion. As a conclusion, we first give in Section 9.1 further com- 
ments on Theorem 4.1 and compare, in particular, the global error with 
the one obtained by Douglas, Ma and Protter [10]. We then give some easy 
extensions in Section 9.2. Finally, we detail in Section 9.3 the technical dif- 
ficulties associated with the natural algorithm (3.9)-(3.10). 

9.1. Comments and comparisons with other methods. We discuss in this 
subsection the total complexity and the rate of convergence of Algorithm 3.1. 

Complexity of the algorithm. Note first that the order of the total com- 
plexity of the algorithm is x M x {26~^{p + R))'^. 

Rate of convergence. Recall also that the global error of the algorithm 
is given by Theorem 4.1. Comparing with the results in [10], this global 
error is worse in our case. There are two reasons to explain this difference. 
The first one does not depend on the algorithm, but is a consequence of our 
working assumptions. Indeed, under suitable smoothness properties of the 
coefficients 6, /, cj and of the solution u, standard Ito developments in -D(l) 
(see Lemma 7.4) would lead to <5^(time) = /i^ as in [10]. 

At the opposite, the second reason for which the global error is worse, 
in our case, depends on the specific structure of the algorithm. Indeed, our 
choice to avoid linear interpolation procedures induces a rather large pro- 
jection error (space). To reach a term of order one with respect to h for 
S"^ (space) , we then need to take 6 = h?!"^ . This choice is far from being satis- 
factory and highly increases the complexity when the dimension d increases. 
Intuitively, there is no specific reason for such a relationship between 6 and 
h: as explained in Section 4.1, b has just to be small in front of h to take into 
account the influence of the drift h at the local level. For this reason, we aim 
to study in further investigations the convergence analysis of the algorithm 
when using a suitable "smooth" interpolation operator instead of a rough 
projection mapping. This point is discussed in a detailed way in the next 
subsection. 

Further comments on errors. To conclude this subsection, we investigate 
the three last error terms, i?(trunc), f (quantiz) and i5(gradient,p). 

The truncation error decays linearly when the grid size increases. This 
control may seem rather poor to the reader. Recall indeed that i5(trunc) 
appears, up to the discretization procedure, as the probability that a dif- 
fusion process leaves a given bounded set. In the case of elliptic diffusions 
with bounded coefficients, it is well known that this probability decays ex- 
ponentially fast as the size of the underlying set increases. Recall in this 
frame from Theorem 2.1 that the coefficients of the elliptic diffusion U are 
bounded. Note, however, that this rough argument fails in the discretized 
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setting since there is no a priori sharp estimate of the approximate gradient 
V and thus of the associated approximate drift. This explains why our strat- 
egy to estimate <S(trunc) hes on the Bienayme-Chebyshev inequaUty and, 
thus, provides the current form given by Theorem 4.1. Similar techniques 
could yield a polynomial decay for every q>l, the constant of the theorem 
being an increasing function of q [see Lemma 8.1 and (8.5)]. 

Note finally that the errors associated to the quantization procedure, 
i5^(quantiz), and to the probabilistic approximation of the gradient, 

(gradient, p), are explicitly controlled in terms of M, h and 6. They em- 
phasize the price to pay to weaken the assumptions: we have to assume that 
the quantization grid is rather small compared to the spatial discretization 
one. Obviously, this increases the number of elementary operations of the 
algorithm and, thus, its total complexity. However, this does not affect so 
much the discretization procedure of the Gaussian law itself since quanti- 
zation grids can be computed once for all apart from the implementation 
procedure of the algorithm. 

9.2. Extensions and further investigations. We now discuss some possi- 
ble extensions of our work. 

Interpolation procedure. As stated later in this subsection, we first in- 
vestigate the assets and liabilities of a smooth interpolation procedure. One 
of the main advantages of the spatial discretization proposed in Section 3.4, 
and then used in Algorithm 3.1, lies in its simplicity of implementation. 
However, from a purely mathematical point of view, this procedure may be 
rather awkward since it ignores more or less the deep smoothness of the true 
solution u. 

Note in this framework that the function Hoo may be seen as an opera- 
tor acting on functions from M*^ into R. For such a function, the operator 
provides a rough interpolation of order depending on the values of the 
function on the spatial mesh Coo- As mentioned above, this interpolation 
procedure does not preserve the smoothness properties of the underlying 
function: in any cases, except if the function is constant, the interpolation 
procedure induces jumps of size of order 6. As a consequence, the distance 
between the function and the interpolated one is also of order 6. 

A relevant strategy would consist in replacing the projection Hoo by a 
smoother interpolation operator. In our framework, an interpolation opera- 
tor is said to be smooth if the distance between a given function £ and the 
interpolated one decreases with the regularity order of i. For example, in 
dimension 1, the linear interpolation operator, 

£^{x^6-\6 + 6l6-^x\ -x)£{6l6-^x\) + 6-\x-6l6-^x\)l{Sl6-^x\+5)), 
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maps a C2(IR,M) function into a piecewise smooth function and the distance 
between them is of order 5"^ . 

Algorithm 3.1 can be written with respect to this new choice, but we also 
believe that the proof would be more difficult to detail. Moreover, smooth 
interpolation procedures in higher dimension slow down the running of the 
underlying algorithm. 

Weakening assumption. Note to conclude this subsection that some as- 
sumptions could be weakened. First, Theorem 2.1 still holds if h and / are 
just Holder in x: in such a case, usual estimates of the gradient of u hold 
and Schauder's theory still applies. In particular, the reader can verify that 
Theorems 4.1 and 4.2 are still valid in this case (but Theorem 4.3 given in 
Section 4.3 is not). 

Moreover, Algorithm 3.1 still converges if 6,/ and cr depend on t in a 
Holder way. 

Finally, the following extension is conceivable. For if € C^~^", a g]0,1[, 
the partial derivatives of order two of u have an integrable singularity in 
the neighborhood of T. In this frame, it would be interesting to adapt the 
Gronwall arguments given in Section 8. 

9.3. Justification of Algorithm 3.1. We finally explain why we are not 
able to show the convergence of Algorithm (3.9)-(3.10). 

Convergence of algorithm (3.9) -(3.10). Recall that the main difference 
between the algorithm (3.9)-(3.10) and Algorithm 3.1 lies in the definition 
of the forward transitions. Indeed, in the algorithm (3.9)-(3.10), 

T{tk,x) = b{x, u(tk+i,x),v{tk+i,x))h + a{x, u(tk+i,x))g{AB''), 

Xo = xo, yke{o,...,N-i}, Xt,^^=Uk+i{Xt,+Titk,Xt,)). 

Unfortunately, in this case, the well-known BSDE machinery fails under As- 
sumption (A). At first sight, this could seem rather amazing. Indeed, recall 
that very strong a priori estimates of the solution u and of its partial deriva- 
tives hold in our framework. In particular, we could expect the discretization 
procedure of u and of its gradient to converge under such smoothness prop- 
erties. 

The main difficulty encountered to establish the convergence of the algo- 
rithm (3.9)-(3.10) appears in Section 7. The lack of a priori controls of the 
regularity of u and v makes the stability strategy fruitless. Note, indeed, 
that inequality (7.1) becomes in the frame of the indicated algorithm 

N 

\{u-u)iO,xo)\' + C^\hY,E[\{v-v)it,.,,Xt^^,)\\t^_^^r^}] 
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<C^7.1 



{''"oo < +00} + £^{time) + (space) + <5^(quantiz) 

N 



(9.1) +v-'hY,E[\iu-u){t„Xt^_,)\%,^_^^,^y] 

i=i 

N 

+ r/-i/i^E[|(n-n)(t,„i,Xi^_J|'l|i^_^<,^|] 
i=i 

N 

+ irj + h)hJ2E[\iv-v){tj,Xt^_,)\\,^_^^,^}] 

Inequalities (7.1) and (9.1) just differ in tlie last term: v{tj-i,Xtj_j^) becomes 
v{tj, Xt^_j^). Note that to be complete a similar shift occurs in v but, due 
to Theorem 2.1, it can be removed without any difficulties. To apply the 
strategy used in Section 7, and, in particular, to derive an equivalent of 
Theorem 7.3 from (9.1), we then need to investigate the regularity in space 
of V. According to the definition of -u, a first step then consists in studying 
the regularity in space of u. 



Lipschitz control of u. Note that the natural strategy to control the 
oscillations of u would consist in applying the usual FBSDE machinery to 
the triples (x*^'^, y*'='^, Z*"'^) and {X^k,y ^y^k^y ^ z^^'^y) for A; e {0, . . . , - 1} 
and x,y£Ck- Of course, superscripts {tk,x) and {tk,y) denote the initial 
conditions of the Markov process X. 

Nevertheless, we are not able to apply the strategies used in [6, 7] to 
derive from the forward-backward writing local and global estimates of the 
discrete gradient of u. There are two reasons to explain this failure. 

First, the rough projection mapping chosen induces an irreducible error 
greater than 5 when estimating the difference between u{tk,x) and u{tk,y) 
in function of the parameters deriving from Assumption (A). The strategy 
to overcome this difficulty is well known: the projection mapping has to be 
replaced by a smoother interpolation operator. 

Second, any probabilistic strategy to estimate the Lipschitz constant of 
u in X such as the one exposed in [6] leads one way or another to the same 
difficulty as the one encountered to apply the stability procedure to the 
algorithm (3.9)-(3.10). More precisely, studying the difference between the 
triples (x*fe'^,y*fc'^,Z*'='^) and {X^k,y ^Y^k,y ^ z*>"y), for e {0, . . . , A^ - 1} 
and x,y £Ck, leads to investigate the regularity of v. In short, one needs to 
estimate first the regularity of v to derive the one of u. Intuitively, it is well 
understood that this is hopeless. 
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